Overview

Dataset statistics

Number of variables: 30
Number of observations: 26996
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 MiB
Average record size in memory: 240.0 B
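
The two memory figures above agree with each other if every one of the 30 variables occupies 8 bytes per row. That per-value size is an assumption for illustration; the report does not list the column dtypes. A minimal sketch of the arithmetic:

```python
# Figures taken from the dataset statistics above.
n_variables = 30
n_observations = 26996

# Assumption (not stated in the report): each column stores one 8-byte
# value per row, e.g. int64, float64 or an object pointer.
bytes_per_value = 8

# Size of one record, then of the whole table in MiB (1 MiB = 2**20 bytes).
record_size_bytes = n_variables * bytes_per_value
total_size_mib = record_size_bytes * n_observations / 2**20

print(f"Average record size in memory: {record_size_bytes:.1f} B")
print(f"Total size in memory: {total_size_mib:.1f} MiB")
```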

Variable types

Numeric: 13
Categorical: 17

Alerts

user is highly correlated with cc_recommended and 1 other field [High correlation]
deposits is highly correlated with withdrawal and 2 other fields [High correlation]
withdrawal is highly correlated with deposits and 1 other field [High correlation]
purchases_partners is highly correlated with deposits and 2 other fields [High correlation]
purchases is highly correlated with deposits and 2 other fields [High correlation]
cc_recommended is highly correlated with user and 3 other fields [High correlation]
cc_application_begin is highly correlated with cc_recommended and 1 other field [High correlation]
web_user is highly correlated with app_web_user [High correlation]
app_web_user is highly correlated with web_user [High correlation]
ios_user is highly correlated with android_user [High correlation]
android_user is highly correlated with ios_user [High correlation]
reward_rate is highly correlated with user and 2 other fields [High correlation]
user is highly correlated with cc_recommended [High correlation]
deposits is highly correlated with purchases [High correlation]
purchases_partners is highly correlated with cc_recommended [High correlation]
purchases is highly correlated with deposits [High correlation]
cc_recommended is highly correlated with user and 3 other fields [High correlation]
cc_application_begin is highly correlated with cc_recommended and 1 other field [High correlation]
web_user is highly correlated with app_web_user [High correlation]
app_web_user is highly correlated with web_user [High correlation]
ios_user is highly correlated with android_user [High correlation]
android_user is highly correlated with ios_user [High correlation]
reward_rate is highly correlated with cc_recommended and 1 other field [High correlation]
deposits is highly correlated with withdrawal and 1 other field [High correlation]
withdrawal is highly correlated with deposits and 1 other field [High correlation]
purchases is highly correlated with deposits and 1 other field [High correlation]
cc_recommended is highly correlated with cc_application_begin and 1 other field [High correlation]
cc_application_begin is highly correlated with cc_recommended and 1 other field [High correlation]
web_user is highly correlated with app_web_user [High correlation]
app_web_user is highly correlated with web_user [High correlation]
ios_user is highly correlated with android_user [High correlation]
android_user is highly correlated with ios_user [High correlation]
reward_rate is highly correlated with cc_recommended and 1 other field [High correlation]
ios_user is highly correlated with android_user [High correlation]
android_user is highly correlated with ios_user [High correlation]
web_user is highly correlated with app_web_user [High correlation]
app_web_user is highly correlated with web_user [High correlation]
user is highly correlated with app_downloaded and 1 other field [High correlation]
deposits is highly correlated with purchases [High correlation]
purchases is highly correlated with deposits [High correlation]
cc_recommended is highly correlated with reward_rate [High correlation]
app_downloaded is highly correlated with user [High correlation]
web_user is highly correlated with app_web_user [High correlation]
app_web_user is highly correlated with web_user [High correlation]
ios_user is highly correlated with android_user [High correlation]
android_user is highly correlated with ios_user [High correlation]
reward_rate is highly correlated with user and 1 other field [High correlation]
cc_disliked is highly skewed (γ1 = 54.97976002) [Skewed]
cc_liked is highly skewed (γ1 = 64.8845572) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
deposits has 18152 (67.2%) zeros [Zeros]
withdrawal has 22991 (85.2%) zeros [Zeros]
purchases_partners has 8652 (32.0%) zeros [Zeros]
purchases has 18291 (67.8%) zeros [Zeros]
cc_taken has 25701 (95.2%) zeros [Zeros]
cc_recommended has 3763 (13.9%) zeros [Zeros]
cc_disliked has 26440 (97.9%) zeros [Zeros]
cc_liked has 26766 (99.1%) zeros [Zeros]
cc_application_begin has 7424 (27.5%) zeros [Zeros]
reward_rate has 3595 (13.3%) zeros [Zeros]
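
The quantities behind the alerts above can be recomputed by hand. The sketch below is an illustration with hypothetical helper names, not the profiler's own code: it derives a zero share from a count, a plain moment-based skewness (the report's γ1 may apply a small-sample correction this version omits), and a Pearson correlation. The short lists passed to the last two helpers are toy data, not values from the dataset.

```python
import math


def zero_share(n_zeros, n_rows):
    """Percentage of rows that are zero, as quoted in the Zeros alerts."""
    return 100.0 * n_zeros / n_rows


def moment_skewness(values):
    """Moment-based skewness m3 / m2**1.5, without small-sample correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


N_ROWS = 26996  # number of observations reported in the overview

# Zero shares recomputed from the counts in the alerts.
print(round(zero_share(18152, N_ROWS), 1))  # deposits
print(round(zero_share(26766, N_ROWS), 1))  # cc_liked

# Toy data: a mostly-zero column is strongly right-skewed, and two
# mutually exclusive flags are perfectly negatively correlated.
print(moment_skewness([0, 0, 0, 0, 10]))
print(pearson([1, 0, 1, 0], [0, 1, 0, 1]))
```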

Reproduction

Analysis started: 2022-12-13 07:15:29.840927
Analysis finished: 2022-12-13 07:16:01.337560
Duration: 31.5 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
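
A report of this kind can be regenerated with pandas-profiling along the following lines. This is a minimal sketch: the function name, the file paths and the title are hypothetical, and the settings stored in config.json are not reproduced here.

```python
def build_profile_report(csv_path, html_path, title="Profiling report"):
    """Profile the CSV at csv_path and write an HTML report to html_path."""
    # Imported lazily so that defining this helper does not require
    # pandas or pandas-profiling to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)               # hypothetical input file
    report = ProfileReport(df, title=title)  # default configuration
    report.to_file(html_path)
    return report
```

A call would look like `build_profile_report("data.csv", "report.html")`, where both file names are placeholders.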

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 26996
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13500.26878
Minimum: 0
Maximum: 26999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1349.75
Q1: 6750.75
median: 13500.5
Q3: 20250.25
95-th percentile: 25649.25
Maximum: 26999
Range: 26999
Interquartile range (IQR): 13499.5

Descriptive statistics

Standard deviation: 7794.356275
Coefficient of variation (CV): 0.5773482293
Kurtosis: -1.200023109
Mean: 13500.26878
Median Absolute Deviation (MAD): 6750
Skewness: -8.408521038 × 10^-5
Sum: 364453256
Variance: 60751989.75
Monotonicity: Strictly increasing
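
Several of the figures above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the df_index values quoted above (the variable names are only illustrative):

```python
# Values copied from the df_index statistics above.
minimum, maximum = 0, 26999
q1, q3 = 6750.75, 20250.25
mean, std = 13500.26878, 7794.356275

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.2f}")
```

Small differences in the last digits are expected, because the inputs are already rounded in the report.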
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20471
 
< 0.1%
259581
 
< 0.1%
74651
 
< 0.1%
13221
 
< 0.1%
33711
 
< 0.1%
136121
 
< 0.1%
156611
 
< 0.1%
95181
 
< 0.1%
115671
 
< 0.1%
218241
 
< 0.1%
Other values (26986)26986
> 99.9%
ValueCountFrequency (%)
01
< 0.1%
11
< 0.1%
21
< 0.1%
31
< 0.1%
41
< 0.1%
51
< 0.1%
61
< 0.1%
71
< 0.1%
81
< 0.1%
91
< 0.1%
ValueCountFrequency (%)
269991
< 0.1%
269981
< 0.1%
269971
< 0.1%
269961
< 0.1%
269951
< 0.1%
269941
< 0.1%
269931
< 0.1%
269921
< 0.1%
269911
< 0.1%
269901
< 0.1%

user
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 24737
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35418.5353
Minimum: 1
Maximum: 69658
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3401.75
Q1: 17808.75
median: 35745.5
Q3: 53236.75
95-th percentile: 66432.25
Maximum: 69658
Range: 69657
Interquartile range (IQR): 35428

Descriptive statistics

Standard deviation: 20319.62035
Coefficient of variation (CV): 0.5737001878
Kurtosis: -1.218291786
Mean: 35418.5353
Median Absolute Deviation (MAD): 17757
Skewness: -0.04003022018
Sum: 956158779
Variance: 412886971.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
641594
 
< 0.1%
580724
 
< 0.1%
156124
 
< 0.1%
349334
 
< 0.1%
219754
 
< 0.1%
563274
 
< 0.1%
642414
 
< 0.1%
545164
 
< 0.1%
244904
 
< 0.1%
433424
 
< 0.1%
Other values (24727)26956
99.9%
ValueCountFrequency (%)
11
< 0.1%
41
< 0.1%
81
< 0.1%
91
< 0.1%
101
< 0.1%
111
< 0.1%
131
< 0.1%
141
< 0.1%
181
< 0.1%
191
< 0.1%
ValueCountFrequency (%)
696581
< 0.1%
696551
< 0.1%
696531
< 0.1%
696511
< 0.1%
696501
< 0.1%
696491
< 0.1%
696471
< 0.1%
696441
< 0.1%
696431
< 0.1%
696401
< 0.1%

churn
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
15822 
1
11174 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
015822
58.6%
111174
41.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
015822
58.6%
111174
41.4%

Most occurring characters

ValueCountFrequency (%)
015822
58.6%
111174
41.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
015822
58.6%
111174
41.4%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
015822
58.6%
111174
41.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
015822
58.6%
111174
41.4%

age
Real number (ℝ≥0)

Distinct: 73
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.21992147
Minimum: 17
Maximum: 91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 17
5-th percentile: 20
Q1: 25
median: 30
Q3: 37
95-th percentile: 52
Maximum: 91
Range: 74
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 9.964837841
Coefficient of variation (CV): 0.3092756713
Kurtosis: 1.570692416
Mean: 32.21992147
Median Absolute Deviation (MAD): 6
Skewness: 1.180387561
Sum: 869809
Variance: 99.29799319
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
261440
 
5.3%
251405
 
5.2%
271371
 
5.1%
241370
 
5.1%
281327
 
4.9%
231282
 
4.7%
291267
 
4.7%
301155
 
4.3%
221122
 
4.2%
311060
 
3.9%
Other values (63)14197
52.6%
ValueCountFrequency (%)
1764
 
0.2%
18215
 
0.8%
19443
 
1.6%
20780
2.9%
21987
3.7%
221122
4.2%
231282
4.7%
241370
5.1%
251405
5.2%
261440
5.3%
ValueCountFrequency (%)
912
< 0.1%
891
 
< 0.1%
881
 
< 0.1%
871
 
< 0.1%
854
< 0.1%
842
< 0.1%
831
 
< 0.1%
821
 
< 0.1%
811
 
< 0.1%
802
< 0.1%

housing
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
na
13856 
R
10969 
O
2171 

Length

Max length2
Median length2
Mean length1.513261224
Min length1

Characters and Unicode

Total characters40852
Distinct characters4
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
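
As an illustration of the character properties mentioned above, the general category of each character can be looked up with Python's standard `unicodedata` module. The sketch below tallies categories for the three housing values, weighted by the counts listed for this variable. It is not necessarily how the profiler computes them, and it covers only categories (the standard library has no lookup for scripts or blocks).

```python
import unicodedata
from collections import Counter

# Value counts for the housing variable, copied from the report.
housing_counts = {"na": 13856, "R": 10969, "O": 2171}

# Tally each character's Unicode general category, weighted by how
# often the value containing it occurs.
category_totals = Counter()
for value, count in housing_counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += count

# "Ll" is Lowercase Letter and "Lu" is Uppercase Letter.
for category, total in category_totals.most_common():
    print(category, total)
```

The totals for `Ll` and `Lu` match the Lowercase Letter and Uppercase Letter counts the report gives for housing.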

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowna
2nd rowR
3rd rowR
4th rowR
5th rowna

Common Values

ValueCountFrequency (%)
na13856
51.3%
R10969
40.6%
O2171
 
8.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
na13856
51.3%
r10969
40.6%
o2171
 
8.0%

Most occurring characters

ValueCountFrequency (%)
a13856
33.9%
n13856
33.9%
R10969
26.9%
O2171
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter27712
67.8%
Uppercase Letter13140
32.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a13856
50.0%
n13856
50.0%
Uppercase Letter
ValueCountFrequency (%)
R10969
83.5%
O2171
 
16.5%

Most occurring scripts

ValueCountFrequency (%)
Latin40852
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a13856
33.9%
n13856
33.9%
R10969
26.9%
O2171
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII40852
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a13856
33.9%
n13856
33.9%
R10969
26.9%
O2171
 
5.3%

deposits
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 66
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.342050674
Minimum: 0
Maximum: 65
Zeros: 18152
Zeros (%): 67.2%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 22
Maximum: 65
Range: 65
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.131991934
Coefficient of variation (CV): 2.732451666
Kurtosis: 15.35084463
Mean: 3.342050674
Median Absolute Deviation (MAD): 0
Skewness: 3.798010992
Sum: 90222
Variance: 83.39327668
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
018152
67.2%
12461
 
9.1%
21085
 
4.0%
3679
 
2.5%
4454
 
1.7%
5401
 
1.5%
6328
 
1.2%
7266
 
1.0%
8207
 
0.8%
9193
 
0.7%
Other values (56)2770
 
10.3%
ValueCountFrequency (%)
018152
67.2%
12461
 
9.1%
21085
 
4.0%
3679
 
2.5%
4454
 
1.7%
5401
 
1.5%
6328
 
1.2%
7266
 
1.0%
8207
 
0.8%
9193
 
0.7%
ValueCountFrequency (%)
651
 
< 0.1%
642
 
< 0.1%
632
 
< 0.1%
627
 
< 0.1%
614
 
< 0.1%
6026
0.1%
5920
0.1%
5819
0.1%
5726
0.1%
5622
0.1%

withdrawal
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 23
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3070454882
Minimum: 0
Maximum: 29
Zeros: 22991
Zeros (%): 85.2%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 29
Range: 29
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.055487779
Coefficient of variation (CV): 3.437561599
Kurtosis: 95.87654918
Mean: 0.3070454882
Median Absolute Deviation (MAD): 0
Skewness: 7.323229548
Sum: 8289
Variance: 1.114054453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
ValueCountFrequency (%)
022991
85.2%
12203
 
8.2%
2853
 
3.2%
3434
 
1.6%
4205
 
0.8%
5108
 
0.4%
665
 
0.2%
739
 
0.1%
833
 
0.1%
918
 
0.1%
Other values (13)47
 
0.2%
ValueCountFrequency (%)
022991
85.2%
12203
 
8.2%
2853
 
3.2%
3434
 
1.6%
4205
 
0.8%
5108
 
0.4%
665
 
0.2%
739
 
0.1%
833
 
0.1%
918
 
0.1%
ValueCountFrequency (%)
291
 
< 0.1%
281
 
< 0.1%
241
 
< 0.1%
201
 
< 0.1%
191
 
< 0.1%
171
 
< 0.1%
163
< 0.1%
154
< 0.1%
141
 
< 0.1%
135
< 0.1%

purchases_partners
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 294
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.06667654
Minimum: 0
Maximum: 1067
Zeros: 8652
Zeros (%): 32.0%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 9
Q3: 43
95-th percentile: 108
Maximum: 1067
Range: 1067
Interquartile range (IQR): 43

Descriptive statistics

Standard deviation: 42.22143182
Coefficient of variation (CV): 1.50432602
Kurtosis: 33.21208618
Mean: 28.06667654
Median Absolute Deviation (MAD): 9
Skewness: 3.449851935
Sum: 757688
Variance: 1782.649305
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
08652
32.0%
11093
 
4.0%
2812
 
3.0%
3727
 
2.7%
4554
 
2.1%
5501
 
1.9%
6387
 
1.4%
7386
 
1.4%
9316
 
1.2%
8298
 
1.1%
Other values (284)13270
49.2%
ValueCountFrequency (%)
08652
32.0%
11093
 
4.0%
2812
 
3.0%
3727
 
2.7%
4554
 
2.1%
5501
 
1.9%
6387
 
1.4%
7386
 
1.4%
8298
 
1.1%
9316
 
1.2%
ValueCountFrequency (%)
10671
< 0.1%
8331
< 0.1%
7091
< 0.1%
6941
< 0.1%
6121
< 0.1%
5241
< 0.1%
4901
< 0.1%
4472
< 0.1%
4461
< 0.1%
4081
< 0.1%

purchases
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 64
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.273966514
Minimum: 0
Maximum: 63
Zeros: 18291
Zeros (%): 67.8%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 22
Maximum: 63
Range: 63
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 8.953651403
Coefficient of variation (CV): 2.734802377
Kurtosis: 15.27079914
Mean: 3.273966514
Median Absolute Deviation (MAD): 0
Skewness: 3.79020009
Sum: 88384
Variance: 80.16787345
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
018291
67.8%
12388
 
8.8%
21059
 
3.9%
3678
 
2.5%
4445
 
1.6%
5403
 
1.5%
6327
 
1.2%
7262
 
1.0%
8210
 
0.8%
9196
 
0.7%
Other values (54)2737
 
10.1%
ValueCountFrequency (%)
018291
67.8%
12388
 
8.8%
21059
 
3.9%
3678
 
2.5%
4445
 
1.6%
5403
 
1.5%
6327
 
1.2%
7262
 
1.0%
8210
 
0.8%
9196
 
0.7%
ValueCountFrequency (%)
631
 
< 0.1%
621
 
< 0.1%
613
 
< 0.1%
6015
 
0.1%
5913
 
< 0.1%
5823
0.1%
5719
0.1%
5619
0.1%
5543
0.2%
5421
0.1%

cc_taken
Real number (ℝ≥0)

ZEROS

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.07378870944
Minimum: 0
Maximum: 29
Zeros: 25701
Zeros (%): 95.2%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 29
Range: 29
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4373306733
Coefficient of variation (CV): 5.926796615
Kurtosis: 792.3324864
Mean: 0.07378870944
Median Absolute Deviation (MAD): 0
Skewness: 17.58909071
Sum: 1992
Variance: 0.1912581178
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
025701
95.2%
1923
 
3.4%
2218
 
0.8%
375
 
0.3%
445
 
0.2%
516
 
0.1%
611
 
< 0.1%
102
 
< 0.1%
72
 
< 0.1%
291
 
< 0.1%
Other values (2)2
 
< 0.1%
ValueCountFrequency (%)
025701
95.2%
1923
 
3.4%
2218
 
0.8%
375
 
0.3%
445
 
0.2%
516
 
0.1%
611
 
< 0.1%
72
 
< 0.1%
81
 
< 0.1%
102
 
< 0.1%
ValueCountFrequency (%)
291
 
< 0.1%
111
 
< 0.1%
102
 
< 0.1%
81
 
< 0.1%
72
 
< 0.1%
611
 
< 0.1%
516
 
0.1%
445
 
0.2%
375
 
0.3%
2218
0.8%

cc_recommended
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 325
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 92.63950215
Minimum: 0
Maximum: 522
Zeros: 3763
Zeros (%): 13.9%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10
median: 65
Q3: 164
95-th percentile: 261
Maximum: 522
Range: 522
Interquartile range (IQR): 154

Descriptive statistics

Standard deviation: 88.8687734
Coefficient of variation (CV): 0.9592967508
Kurtosis: -0.7894641265
Mean: 92.63950215
Median Absolute Deviation (MAD): 62
Skewness: 0.6773721605
Sum: 2500896
Variance: 7897.658886
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
03763
 
13.9%
5757
 
2.8%
1440
 
1.6%
10401
 
1.5%
6338
 
1.3%
15318
 
1.2%
4276
 
1.0%
20243
 
0.9%
2232
 
0.9%
11228
 
0.8%
Other values (315)20000
74.1%
ValueCountFrequency (%)
03763
13.9%
1440
 
1.6%
2232
 
0.9%
3211
 
0.8%
4276
 
1.0%
5757
 
2.8%
6338
 
1.3%
7158
 
0.6%
8169
 
0.6%
9184
 
0.7%
ValueCountFrequency (%)
5221
 
< 0.1%
3261
 
< 0.1%
3241
 
< 0.1%
3232
 
< 0.1%
3221
 
< 0.1%
3212
 
< 0.1%
3205
< 0.1%
3181
 
< 0.1%
3161
 
< 0.1%
3153
< 0.1%

cc_disliked
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 20
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.05063713143
Minimum: 0
Maximum: 65
Zeros: 26440
Zeros (%): 97.9%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 65
Range: 65
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.871430837
Coefficient of variation (CV): 17.20932471
Kurtosis: 3703.501684
Mean: 0.05063713143
Median Absolute Deviation (MAD): 0
Skewness: 54.97976002
Sum: 1367
Variance: 0.7593917037
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
ValueCountFrequency (%)
026440
97.9%
1363
 
1.3%
278
 
0.3%
436
 
0.1%
334
 
0.1%
511
 
< 0.1%
69
 
< 0.1%
83
 
< 0.1%
153
 
< 0.1%
93
 
< 0.1%
Other values (10)16
 
0.1%
ValueCountFrequency (%)
026440
97.9%
1363
 
1.3%
278
 
0.3%
334
 
0.1%
436
 
0.1%
511
 
< 0.1%
69
 
< 0.1%
72
 
< 0.1%
83
 
< 0.1%
93
 
< 0.1%
ValueCountFrequency (%)
651
 
< 0.1%
621
 
< 0.1%
592
< 0.1%
251
 
< 0.1%
231
 
< 0.1%
153
< 0.1%
131
 
< 0.1%
122
< 0.1%
113
< 0.1%
102
< 0.1%

cc_liked
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.01311305379
Minimum: 0
Maximum: 27
Zeros: 26766
Zeros (%): 99.1%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 27
Range: 27
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2381752869
Coefficient of variation (CV): 18.16322046
Kurtosis: 6449.922431
Mean: 0.01311305379
Median Absolute Deviation (MAD): 0
Skewness: 64.8845572
Sum: 354
Variance: 0.05672746727
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%)
026766
99.1%
1182
 
0.7%
226
 
0.1%
311
 
< 0.1%
46
 
< 0.1%
92
 
< 0.1%
271
 
< 0.1%
101
 
< 0.1%
81
 
< 0.1%
ValueCountFrequency (%)
026766
99.1%
1182
 
0.7%
226
 
0.1%
311
 
< 0.1%
46
 
< 0.1%
81
 
< 0.1%
92
 
< 0.1%
101
 
< 0.1%
271
 
< 0.1%
ValueCountFrequency (%)
271
 
< 0.1%
101
 
< 0.1%
92
 
< 0.1%
81
 
< 0.1%
46
 
< 0.1%
311
 
< 0.1%
226
 
0.1%
1182
 
0.7%
026766
99.1%

cc_application_begin
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 128
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.199066528
Minimum: 0
Maximum: 263
Zeros: 7424
Zeros (%): 27.5%
Negative: 0
Negative (%): 0.0%
Memory size: 211.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4
Q3: 11
95-th percentile: 31
Maximum: 263
Range: 263
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 12.49777045
Coefficient of variation (CV): 1.52429187
Kurtosis: 29.36777685
Mean: 8.199066528
Median Absolute Deviation (MAD): 4
Skewness: 3.855547005
Sum: 221342
Variance: 156.1942662
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
07424
27.5%
12293
 
8.5%
21895
 
7.0%
31495
 
5.5%
41350
 
5.0%
51205
 
4.5%
61050
 
3.9%
7905
 
3.4%
8847
 
3.1%
9785
 
2.9%
Other values (118)7747
28.7%
ValueCountFrequency (%)
07424
27.5%
12293
 
8.5%
21895
 
7.0%
31495
 
5.5%
41350
 
5.0%
51205
 
4.5%
61050
 
3.9%
7905
 
3.4%
8847
 
3.1%
9785
 
2.9%
ValueCountFrequency (%)
2631
< 0.1%
2301
< 0.1%
1841
< 0.1%
1831
< 0.1%
1611
< 0.1%
1482
< 0.1%
1472
< 0.1%
1402
< 0.1%
1381
< 0.1%
1361
< 0.1%

app_downloaded
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
1
25714 
0
 
1282

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
125714
95.3%
01282
 
4.7%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
125714
95.3%
01282
 
4.7%

Most occurring characters

ValueCountFrequency (%)
125714
95.3%
01282
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
125714
95.3%
01282
 
4.7%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
125714
95.3%
01282
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
125714
95.3%
01282
 
4.7%

web_user
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
1
16364 
0
10632 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
116364
60.6%
010632
39.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
116364
60.6%
010632
39.4%

Most occurring characters

ValueCountFrequency (%)
116364
60.6%
010632
39.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
116364
60.6%
010632
39.4%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
116364
60.6%
010632
39.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
116364
60.6%
010632
39.4%

app_web_user
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
1
15167 
0
11829 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row0
4th row0
5th row1

Common Values

ValueCountFrequency (%)
115167
56.2%
011829
43.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
115167
56.2%
011829
43.8%

Most occurring characters

ValueCountFrequency (%)
115167
56.2%
011829
43.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
115167
56.2%
011829
43.8%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
115167
56.2%
011829
43.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
115167
56.2%
011829
43.8%

ios_user
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
16360 
1
10636 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row1
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
016360
60.6%
110636
39.4%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
016360
60.6%
110636
39.4%

Most occurring characters

ValueCountFrequency (%)
016360
60.6%
110636
39.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
016360
60.6%
110636
39.4%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
016360
60.6%
110636
39.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
016360
60.6%
110636
39.4%

android_user
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
1
15853 
0
11143 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row1
4th row1
5th row1

Common Values

ValueCountFrequency (%)
115853
58.7%
011143
41.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
115853
58.7%
011143
41.3%

Most occurring characters

ValueCountFrequency (%)
115853
58.7%
011143
41.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
115853
58.7%
011143
41.3%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
115853
58.7%
011143
41.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
115853
58.7%
011143
41.3%
Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
21956 
2
4048 
3
 
754
4
 
183
5
 
55

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row2
4th row0
5th row0

Common Values

ValueCountFrequency (%)
021956
81.3%
24048
 
15.0%
3754
 
2.8%
4183
 
0.7%
555
 
0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
021956
81.3%
24048
 
15.0%
3754
 
2.8%
4183
 
0.7%
555
 
0.2%

Most occurring characters

ValueCountFrequency (%)
021956
81.3%
24048
 
15.0%
3754
 
2.8%
4183
 
0.7%
555
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
021956
81.3%
24048
 
15.0%
3754
 
2.8%
4183
 
0.7%
555
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
021956
81.3%
24048
 
15.0%
3754
 
2.8%
4183
 
0.7%
555
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
021956
81.3%
24048
 
15.0%
3754
 
2.8%
4183
 
0.7%
555
 
0.2%

payment_type
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
Bi-Weekly
12713 
Weekly
5289 
na
3899 
Monthly
2656 
Semi-Monthly
2439 

Length

Max length12
Median length9
Mean length7.475514891
Min length2

Characters and Unicode

Total characters201809
Distinct characters16
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowBi-Weekly
2nd rowWeekly
3rd rowSemi-Monthly
4th rowBi-Weekly
5th rowBi-Weekly

Common Values

ValueCountFrequency (%)
Bi-Weekly12713
47.1%
Weekly5289
19.6%
na3899
 
14.4%
Monthly2656
 
9.8%
Semi-Monthly2439
 
9.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
bi-weekly12713
47.1%
weekly5289
19.6%
na3899
 
14.4%
monthly2656
 
9.8%
semi-monthly2439
 
9.0%

Most occurring characters

ValueCountFrequency (%)
e38443
19.0%
y23097
11.4%
l23097
11.4%
k18002
8.9%
W18002
8.9%
-15152
 
7.5%
i15152
 
7.5%
B12713
 
6.3%
n8994
 
4.5%
h5095
 
2.5%
Other values (6)24062
11.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter148408
73.5%
Uppercase Letter38249
 
19.0%
Dash Punctuation15152
 
7.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e38443
25.9%
y23097
15.6%
l23097
15.6%
k18002
12.1%
i15152
 
10.2%
n8994
 
6.1%
h5095
 
3.4%
t5095
 
3.4%
o5095
 
3.4%
a3899
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
W18002
47.1%
B12713
33.2%
M5095
 
13.3%
S2439
 
6.4%
Dash Punctuation
ValueCountFrequency (%)
-15152
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin186657
92.5%
Common15152
 
7.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e38443
20.6%
y23097
12.4%
l23097
12.4%
k18002
9.6%
W18002
9.6%
i15152
 
8.1%
B12713
 
6.8%
n8994
 
4.8%
h5095
 
2.7%
t5095
 
2.7%
Other values (5)18967
10.2%
Common
ValueCountFrequency (%)
-15152
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII201809
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e38443
19.0%
y23097
11.4%
l23097
11.4%
k18002
8.9%
W18002
8.9%
-15152
 
7.5%
i15152
 
7.5%
B12713
 
6.3%
n8994
 
4.5%
h5095
 
2.5%
Other values (6)24062
11.9%

waiting_4_loan
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
26961 
1
 
35

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count  Frequency (%)
0      26961  99.9%
1      35     0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
026961
99.9%
135
 
0.1%

Most occurring characters

ValueCountFrequency (%)
026961
99.9%
135
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
026961
99.9%
135
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
026961
99.9%
135
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
026961
99.9%
135
 
0.1%

cancelled_loan
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
26488 
1
 
508

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count  Frequency (%)
0      26488  98.1%
1      508    1.9%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
026488
98.1%
1508
 
1.9%

Most occurring characters

ValueCountFrequency (%)
026488
98.1%
1508
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
026488
98.1%
1508
 
1.9%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
026488
98.1%
1508
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
026488
98.1%
1508
 
1.9%

received_loan
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
26505 
1
 
491

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count  Frequency (%)
0      26505  98.2%
1      491    1.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
026505
98.2%
1491
 
1.8%

Most occurring characters

ValueCountFrequency (%)
026505
98.2%
1491
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
026505
98.2%
1491
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
026505
98.2%
1491
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
026505
98.2%
1491
 
1.8%

rejected_loan
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
26864 
1
 
132

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count  Frequency (%)
0      26864  99.5%
1      132    0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
026864
99.5%
1132
 
0.5%

Most occurring characters

ValueCountFrequency (%)
026864
99.5%
1132
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
026864
99.5%
1132
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
026864
99.5%
1132
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
026864
99.5%
1132
 
0.5%

zodiac_sign
Categorical

Distinct13
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
Cancer
2424 
Virgo
2410 
Leo
2374 
Taurus
2236 
Gemini
2168 
Other values (8)
15384 

Length

Max length11
Median length9
Mean length5.866535783
Min length2

Characters and Unicode

Total characters158373
Distinct characters23
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowLeo
2nd rowLeo
3rd rowCapricorn
4th rowCapricorn
5th rowAries

Common Values

Value             Count  Frequency (%)
Cancer            2424   9.0%
Virgo             2410   8.9%
Leo               2374   8.8%
Taurus            2236   8.3%
Gemini            2168   8.0%
na                2155   8.0%
Libra             2128   7.9%
Pisces            2127   7.9%
Scorpio           2118   7.8%
Aquarius          2117   7.8%
Other values (3)  4739   17.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value             Count  Frequency (%)
cancer            2424   9.0%
virgo             2410   8.9%
leo               2374   8.8%
taurus            2236   8.3%
gemini            2168   8.0%
na                2155   8.0%
libra             2128   7.9%
pisces            2127   7.9%
scorpio           2118   7.8%
aquarius          2117   7.8%
Other values (3)  4739   17.6%

Most occurring characters

ValueCountFrequency (%)
i22031
13.9%
r18854
11.9%
a15854
10.0%
s12664
 
8.0%
e11094
 
7.0%
u10762
 
6.8%
o9702
 
6.1%
n7429
 
4.7%
c7351
 
4.6%
L4502
 
2.8%
Other values (13)38130
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter133532
84.3%
Uppercase Letter24841
 
15.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i22031
16.5%
r18854
14.1%
a15854
11.9%
s12664
9.5%
e11094
8.3%
u10762
8.1%
o9702
7.3%
n7429
 
5.6%
c7351
 
5.5%
g4466
 
3.3%
Other values (5)13325
10.0%
Uppercase Letter
ValueCountFrequency (%)
L4502
18.1%
S4174
16.8%
A4118
16.6%
C3106
12.5%
V2410
9.7%
T2236
9.0%
G2168
8.7%
P2127
8.6%

Most occurring scripts

ValueCountFrequency (%)
Latin158373
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i22031
13.9%
r18854
11.9%
a15854
10.0%
s12664
 
8.0%
e11094
 
7.0%
u10762
 
6.8%
o9702
 
6.1%
n7429
 
4.7%
c7351
 
4.6%
L4502
 
2.8%
Other values (13)38130
24.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII158373
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i22031
13.9%
r18854
11.9%
a15854
10.0%
s12664
 
8.0%
e11094
 
7.0%
u10762
 
6.8%
o9702
 
6.1%
n7429
 
4.7%
c7351
 
4.6%
L4502
 
2.8%
Other values (13)38130
24.1%

left_for_two_month_plus
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
22313 
1
4683 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row1
4th row0
5th row1

Common Values

Value  Count  Frequency (%)
0      22313  82.7%
1      4683   17.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
022313
82.7%
14683
 
17.3%

Most occurring characters

ValueCountFrequency (%)
022313
82.7%
14683
 
17.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
022313
82.7%
14683
 
17.3%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
022313
82.7%
14683
 
17.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
022313
82.7%
14683
 
17.3%

left_for_one_month
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
26508 
1
 
488

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value  Count  Frequency (%)
0      26508  98.2%
1      488    1.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
026508
98.2%
1488
 
1.8%

Most occurring characters

ValueCountFrequency (%)
026508
98.2%
1488
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
026508
98.2%
1488
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
026508
98.2%
1488
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
026508
98.2%
1488
 
1.8%

reward_rate
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct193
Distinct (%)0.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.9078189361
Minimum0
Maximum4
Zeros3595
Zeros (%)13.3%
Negative0
Negative (%)0.0%
Memory size211.0 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10.2
median0.78
Q31.53
95-th percentile2.2
Maximum4
Range4
Interquartile range (IQR)1.33

Descriptive statistics

Standard deviation0.7519905542
Coefficient of variation (CV)0.8283486104
Kurtosis-0.7558210633
Mean0.9078189361
Median Absolute Deviation (MAD)0.65
Skewness0.4945052192
Sum24507.48
Variance0.5654897936
MonotonicityNot monotonic
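
The derived statistics above follow from the primary ones by simple arithmetic. The short check below is only a sketch: it copies the reported figures for reward_rate into hypothetical variable names and recomputes the mean, coefficient of variation, variance, and interquartile range from them.

```python
# Figures copied from the reward_rate statistics in this report.
n_observations = 26996
total = 24507.48          # Sum
mean = 0.9078189361       # Mean
std = 0.7519905542        # Standard deviation
q1, q3 = 0.2, 1.53        # Q1 and Q3

mean_from_sum = total / n_observations   # Mean = Sum / number of observations
cv = std / mean                          # Coefficient of variation = std / mean
variance = std ** 2                      # Variance = std squared
iqr = q3 - q1                            # Interquartile range = Q3 - Q1

print(f"mean={mean_from_sum:.6f} cv={cv:.6f} variance={variance:.6f} iqr={iqr:.2f}")
```

Each recomputed value agrees with the corresponding reported figure up to rounding.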
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
0                   3595   13.3%
0.07                802    3.0%
0.03                523    1.9%
0.5                 523    1.9%
0.13                493    1.8%
1                   493    1.8%
2.23                451    1.7%
0.1                 411    1.5%
0.67                407    1.5%
0.33                376    1.4%
Other values (183)  18922  70.1%

Minimum 10 values

Value  Count  Frequency (%)
0      3595   13.3%
0.03   523    1.9%
0.04   59     0.2%
0.05   49     0.2%
0.06   41     0.2%
0.07   802    3.0%
0.08   52     0.2%
0.09   27     0.1%
0.1    411    1.5%
0.11   39     0.1%

Maximum 10 values

Value  Count  Frequency (%)
4      33     0.1%
3.8    1      < 0.1%
3.67   1      < 0.1%
3.6    1      < 0.1%
3.53   1      < 0.1%
3.5    2      < 0.1%
3.3    3      < 0.1%
3.18   1      < 0.1%
3      15     0.1%
2.93   1      < 0.1%

is_referred
Categorical

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size211.0 KiB
0
18411 
1
8585 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters26996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row0
4th row1
5th row0

Common Values

Value  Count  Frequency (%)
0      18411  68.2%
1      8585   31.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
018411
68.2%
18585
31.8%

Most occurring characters

ValueCountFrequency (%)
018411
68.2%
18585
31.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number26996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
018411
68.2%
18585
31.8%

Most occurring scripts

ValueCountFrequency (%)
Common26996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
018411
68.2%
18585
31.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII26996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
018411
68.2%
18585
31.8%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
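
The calculation described above can be sketched in plain Python as follows. The function names (`average_ranks`, `pearson`, `spearman`) and the sample data are assumptions chosen for illustration; this is not the code the report generator uses. Ties receive the average of their rank positions, which is one common convention.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/n factors cancel, so plain sums of deviations are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: monotonic but not linear
print(spearman(x, y), pearson(x, y))
```

For this monotonic but nonlinear relationship, ρ reaches 1 (up to floating-point rounding) while Pearson's r on the raw values stays below 1.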

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
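
A minimal sketch of this formula, together with a check of the location and scale invariance mentioned above, might look like the following. The function name `pearson_r` and the toy data are assumptions for illustration only.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their
    standard deviations (the 1/n factors cancel out)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors):
x_scaled = [10 * a + 3 for a in x]
y_scaled = [0.5 * b - 7 for b in y]
r_scaled = pearson_r(x_scaled, y_scaled)

print(r, r_scaled)
```

Both calls give the same coefficient up to floating-point rounding, which is what invariance under changes in location and scale means in practice.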

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
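
The pair-counting definition above corresponds to what is usually called tau-a. The sketch below implements exactly that definition; the function name `kendall_tau_a` and the sample data are assumptions for illustration. Library implementations commonly apply a tie correction (tau-b), so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x2 - x1) * (y2 - y1)
        if direction > 0:      # both variables move the same way
            concordant += 1
        elif direction < 0:    # the variables move in opposite ways
            discordant += 1
        # direction == 0 means a tie; it counts toward neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)  # 8 concordant and 2 discordant pairs out of 10
print(tau)
```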

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
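
For reference, one way to compute a bias-corrected Cramér's V from a contingency table is sketched below. It follows the correction formula attributed to Bergsma (2013) as it is commonly stated; the function name `cramers_v_corrected` and the example tables are assumptions, no continuity correction is applied, and the report generator's own implementation may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a
    list of rows of observed counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

independent = [[10, 10], [10, 10]]   # no association
perfect = [[50, 0], [0, 50]]         # each row determines the column
moderate = [[30, 10], [10, 30]]

v_independent = cramers_v_corrected(independent)
v_perfect = cramers_v_corrected(perfect)
v_moderate = cramers_v_corrected(moderate)
print(v_independent, v_perfect, v_moderate)
```

The uncorrected estimate for the third table would be 0.5; the corrected value is slightly smaller, which reflects the downward adjustment for sampling bias.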

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

df_index user churn age housing deposits withdrawal purchases_partners purchases cc_taken cc_recommended cc_disliked cc_liked cc_application_begin app_downloaded web_user app_web_user ios_user android_user registered_phones payment_type waiting_4_loan cancelled_loan received_loan rejected_loan zodiac_sign left_for_two_month_plus left_for_one_month reward_rate is_referred
0055409037.0na000000000111100Bi-Weekly0000Leo100.000
1123547028.0R0010096005111100Weekly0000Leo001.471
2258313035.0R47286470285009100012Semi-Monthly0000Capricorn102.170
338095026.0R26338250740026100010Bi-Weekly0000Capricorn001.101
4461353127.0na002000000111010Bi-Weekly0000Aries100.030
553120132.0R53111502270017111010Bi-Weekly0000Taurus001.830
6641406021.0na004000000100010Bi-Weekly0000Cancer000.070
7767679024.0na002000000100010na0000Leo000.110
8821269028.0R0000247109111100Bi-Weekly0000Sagittarius000.871
9925788023.0na108710125003111100Bi-Weekly0000Aquarius001.070

Last rows

df_index user churn age housing deposits withdrawal purchases_partners purchases cc_taken cc_recommended cc_disliked cc_liked cc_application_begin app_downloaded web_user app_web_user ios_user android_user registered_phones payment_type waiting_4_loan cancelled_loan received_loan rejected_loan zodiac_sign left_for_two_month_plus left_for_one_month reward_rate is_referred
269862699032870029.0R1170101470046100100Bi-Weekly0000Cancer001.071
269872699149367030.0na002004000100010Bi-Weekly0000Leo000.030
269882699265830120.0R002005000111010Bi-Weekly0000Leo100.130
269892699341813029.0na115105000100010Bi-Weekly0000Scorpio100.030
269902699449903128.0R00260031000100010Monthly0000Virgo000.600
269912699524291124.0R0000081002111012Weekly0000Leo001.071
26992269964116126.0na002001000111010Bi-Weekly0001Cancer100.670
269932699723740022.0na00370098000111010Bi-Weekly0000Taurus000.930
269942699847663146.0na20162058002111100Semi-Monthly0000Aries100.901
269952699952752134.0na0040011000100112na0000Cancer000.130